{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c006eb1c-3cbd-42ff-9198-537898f496a5",
   "metadata": {},
   "source": [
    "# Create an Agentic RAG using OpenVINO and LlamaIndex\n",
    "\n",
    "An **agent** is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:\n",
    "\n",
    "- Breaking down a complex question into smaller ones\n",
    "- Choosing an external Tool to use + coming up with parameters for calling the Tool\n",
    "- Planning out a set of tasks\n",
    "- Storing previously completed tasks in a memory module\n",
    "\n",
    "[LlamaIndex](https://docs.llamaindex.ai/en/stable/) is a framework for building context-augmented generative AI applications with LLMs.LlamaIndex imposes no restriction on how you use LLMs. You can use LLMs as auto-complete, chatbots, semi-autonomous agents, and more. It just makes using them easier. You can build agents on top of your existing LlamaIndex RAG pipeline to empower it with automated decision capabilities. A lot of modules (routing, query transformations, and more) are already agentic in nature in that they use LLMs for decision making.\n",
    "\n",
    "**Agentic RAG = Agent-based RAG implementation**\n",
    "\n",
    "While standard RAG excels at simple queries across a few documents, agentic RAG takes it a step further and emerges as a potent solution for question answering. It introduces a layer of intelligence by employing AI agents. These agents act as autonomous decision-makers, analyzing initial findings and strategically selecting the most effective tools for further data retrieval. This multi-step reasoning capability empowers agentic RAG to tackle intricate research tasks, like summarizing, comparing information across multiple documents and even formulating follow-up questions -all in an orchestrated and efficient manner.\n",
    "\n",
    "![agentic-rag](https://github.com/openvinotoolkit/openvino_notebooks/assets/91237924/871cb90d-27fd-4a87-aa3c-f4cdb199a148)\n",
    "\n",
    "This example will demonstrate using RAG engines as a tool in an agent with OpenVINO and LlamaIndex.\n",
    "\n",
    "#### Table of contents:\n",
    "\n",
    "- [Prerequisites](#Prerequisites)\n",
    "- [Download models](#Download-models)\n",
    "  - [Download LLM](#Download-LLM)\n",
    "  - [Download Embedding model](#Download-Embedding-model)\n",
    "- [Create models](#Create-models)\n",
    "  - [Create OpenVINO LLM](#Create-OpenVINO-LLM)\n",
    "  - [Create OpenVINO Embedding](#Create-OpenVINO-Embedding)\n",
    "- [Create tools](#Create-tools)\n",
    "- [Run Agentic RAG](#Run-Agentic-RAG)\n",
    "\n",
    "### Installation Instructions\n",
    "\n",
    "This is a self-contained example that relies solely on its own code.\n",
    "\n",
    "We recommend  running the notebook in a virtual environment. You only need a Jupyter server to start.\n",
    "For details, please refer to [Installation Guide](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/README.md#-installation-guide).\n",
    "\n",
    "<img referrerpolicy=\"no-referrer-when-downgrade\" src=\"https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/llm-agent-rag-llamaindex/llm-agent-rag-llamaindex.ipynb\" />\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "377b6144-99c7-4af8-938b-b6fdbca652d7",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Install required dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0368a813",
   "metadata": {},
   "outputs": [],
   "source": [
    "# to avoid conflicts between versions\n",
    "%pip uninstall -q -y optimum optimum-intel optimum-onnx"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9aa478c3-8110-4801-b249-fd3b651dc60b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import requests\n",
    "from pathlib import Path\n",
    "\n",
    "utility_files = [\"notebook_utils.py\", \"cmd_helper.py\", \"pip_helper.py\"]\n",
    "\n",
    "for utility in utility_files:\n",
    "    local_path = Path(utility)\n",
    "    if not local_path.exists():\n",
    "        r = requests.get(\n",
    "            url=f\"https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/{local_path.name}\",\n",
    "        )\n",
    "        with local_path.open(\"w\") as f:\n",
    "            f.write(r.text)\n",
    "\n",
    "os.environ[\"GIT_CLONE_PROTECTION_ACTIVE\"] = \"false\"\n",
    "\n",
    "from pip_helper import pip_install\n",
    "\n",
    "pip_install(\n",
    "    \"-q\",\n",
    "    \"--extra-index-url\",\n",
    "    \"https://download.pytorch.org/whl/cpu\",\n",
    "    \"nncf>=2.18.0\",\n",
    "    \"llama-index\",\n",
    "    \"llama-index-readers-file\",\n",
    "    \"llama-index-core\",\n",
    "    \"torch==2.8\",\n",
    "    \"transformers==4.53.3\",\n",
    "    \"torchvision==0.23.0\",\n",
    "    \"llama-index-llms-huggingface>=0.4.0,<0.5.0\",\n",
    "    \"llama-index-embeddings-huggingface>=0.4.0,<0.5.0\",\n",
    "    \"huggingface-hub>=0.26.5\",\n",
    ")\n",
    "pip_install(\"-q\", \"git+https://github.com/huggingface/optimum-intel.git\", \"datasets<4.0.0\", \"accelerate\")\n",
    "pip_install(\"--pre\", \"-Uq\", \"openvino>=2025.3.0\", \"--extra-index-url\", \"https://storage.openvinotoolkit.org/simple/wheels/nightly\")\n",
    "pip_install(\"--pre\", \"-Uq\", \"openvino-tokenizers[transformers]\", \"--extra-index-url\", \"https://storage.openvinotoolkit.org/simple/wheels/nightly\")\n",
    "pip_install(\"-q\", \"--no-deps\", \"llama-index-llms-openvino\", \"llama-index-embeddings-openvino\")\n",
    "\n",
    "# Read more about telemetry collection at https://github.com/openvinotoolkit/openvino_notebooks?tab=readme-ov-file#-telemetry\n",
    "from notebook_utils import collect_telemetry\n",
    "\n",
    "collect_telemetry(\"llm-agent-rag-llamaindex.ipynb\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b19fb6e-e3f4-43e6-b7f1-5bbf2719fa85",
   "metadata": {},
   "outputs": [],
   "source": [
    "from notebook_utils import download_file\n",
    "\n",
    "\n",
    "text_example_en_path = Path(\"text_example_en.pdf\")\n",
    "text_example_en = \"https://github.com/user-attachments/files/16171326/xeon6-e-cores-network-and-edge-brief.pdf\"\n",
    "\n",
    "\n",
    "download_file(text_example_en, text_example_en_path)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "fd371ab3",
   "metadata": {},
   "source": [
    "## Download models\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "### Download LLM\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "To run LLM locally, we have to download the model in the first step. It is possible to [export your model](https://github.com/huggingface/optimum-intel?tab=readme-ov-file#export) to the OpenVINO IR format with the CLI, and load the model from local folder.\n",
    "\n",
    "Large Language Models (LLMs) are a core component of agent. LlamaIndex does not serve its own LLMs, but rather provides a standard interface for interacting with many different LLMs. In this example, we can select `Phi3-mini-instruct` or `Meta-Llama-3-8B-Instruct` as LLM in agent pipeline.\n",
    "* **phi3-mini-instruct** - The Phi-3-Mini is a 3.8B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties. More details about model can be found in [model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct), [Microsoft blog](https://aka.ms/phi3blog-april) and [technical report](https://aka.ms/phi3-tech-report).\n",
    "* **llama-3.1-8b-instruct** - The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks. More details about model can be found in [Meta blog post](https://ai.meta.com/blog/meta-llama-3-1/), [model website](https://llama.meta.com) and [model card](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct).\n",
    ">**Note**: run model with demo, you will need to accept license agreement. \n",
    ">You must be a registered user in 🤗 Hugging Face Hub. Please visit [HuggingFace model card](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), carefully read terms of usage and click accept button.  You will need to use an access token for the code below to run. For more information on access tokens, refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).\n",
    ">You can login on Hugging Face Hub in notebook environment, using following code:\n",
    " \n",
    "```python\n",
    "    ## login to huggingfacehub to get access to pretrained model \n",
    "\n",
    "    from huggingface_hub import notebook_login, whoami\n",
    "\n",
    "    try:\n",
    "        whoami()\n",
    "        print('Authorization token already provided')\n",
    "    except OSError:\n",
    "        notebook_login()\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "fa272a75-f2ec-4846-8c0e-b187103ab701",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "d758f49f5adf4590908df788060eca0a",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Dropdown(description='Model:', options=('OpenVINO/Phi-3-mini-4k-instruct-int4-ov', 'meta-llama/Meta-Llama-3.1-…"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import ipywidgets as widgets\n",
    "\n",
    "llm_model_ids = [\"OpenVINO/Phi-3-mini-4k-instruct-int4-ov\", \"meta-llama/Meta-Llama-3.1-8B-Instruct\"]\n",
    "\n",
    "llm_model_id = widgets.Dropdown(\n",
    "    options=llm_model_ids,\n",
    "    value=llm_model_ids[0],\n",
    "    description=\"Model:\",\n",
    "    disabled=False,\n",
    ")\n",
    "\n",
    "llm_model_id"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86fdc4ba-74c4-4869-898e-131f47827e8f",
   "metadata": {
    "test_replace": {}
   },
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "import huggingface_hub as hf_hub\n",
    "from cmd_helper import optimum_cli\n",
    "\n",
    "llm_model_path = llm_model_id.value.split(\"/\")[-1]\n",
    "repo_name = llm_model_id.value.split(\"/\")[0]\n",
    "\n",
    "if not Path(llm_model_path).exists():\n",
    "    if repo_name == \"OpenVINO\":\n",
    "        hf_hub.snapshot_download(llm_model_id.value, local_dir=llm_model_path)\n",
    "    else:\n",
    "        optimum_cli(llm_model_id.value, llm_model_path, additional_args={\"task\": \"text-generation-with-past\", \"weight-format\": \"int4\"})"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "cd7b7a18",
   "metadata": {},
   "source": [
    "### Download Embedding model\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Embedding model is another key component in RAG pipeline. It takes text as input, and return a long list of numbers used to capture the semantics of the text. An OpenVINO embedding model and tokenizer can be exported by `feature-extraction` task with `optimum-cli`. In this tutorial, we use [bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "a5ba0440",
   "metadata": {},
   "outputs": [],
   "source": [
    "embedding_model_id = \"BAAI/bge-small-en-v1.5\"\n",
    "embedding_model_path = \"bge-small-en-v1.5\"\n",
    "\n",
    "if not Path(embedding_model_path).exists():\n",
    "    optimum_cli(embedding_model_id, embedding_model_path, additional_args={\"task\": \"feature-extraction\"})"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "157eca4c",
   "metadata": {},
   "source": [
    "## Create models\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "### Create OpenVINO LLM\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Select device for LLM model inference"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bca3764d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from notebook_utils import device_widget\n",
    "\n",
    "llm_device = device_widget(\"CPU\", exclude=[\"NPU\"])\n",
    "\n",
    "llm_device"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "23e3f28b",
   "metadata": {},
   "source": [
    "OpenVINO models can be run locally through the `OpenVINOLLM` class in [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/openvino/). If you have an Intel GPU, you can specify `device_map=\"gpu\"` to run inference on it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "id": "3c259c61-5eef-41a8-a9f7-462f27d0c7d4",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Compiling the model to CPU ...\n"
     ]
    }
   ],
   "source": [
    "from llama_index.llms.openvino import OpenVINOLLM\n",
    "\n",
    "import openvino.properties as props\n",
    "import openvino.properties.hint as hints\n",
    "import openvino.properties.streams as streams\n",
    "\n",
    "\n",
    "ov_config = {hints.performance_mode(): hints.PerformanceMode.LATENCY, streams.num(): \"1\", props.cache_dir(): \"\"}\n",
    "\n",
    "\n",
    "def phi_completion_to_prompt(completion):\n",
    "    return f\"<|system|><|end|><|user|>{completion}<|end|><|assistant|>\\n\"\n",
    "\n",
    "\n",
    "def llama3_completion_to_prompt(completion):\n",
    "    return f\"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\n<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\n{completion}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n\"\n",
    "\n",
    "\n",
    "llm = OpenVINOLLM(\n",
    "    model_id_or_path=str(llm_model_path),\n",
    "    context_window=3900,\n",
    "    max_new_tokens=1000,\n",
    "    model_kwargs={\"ov_config\": ov_config},\n",
    "    generate_kwargs={\"do_sample\": False, \"temperature\": None, \"top_p\": None},\n",
    "    completion_to_prompt=phi_completion_to_prompt if llm_model_path == \"Phi-3-mini-4k-instruct-int4-ov\" else llama3_completion_to_prompt,\n",
    "    device_map=llm_device.value,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "41820eb6",
   "metadata": {},
   "source": [
    "### Create OpenVINO Embedding\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Select device for embedding model inference"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "6e41705e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4d5021c768544adfb86bd5106923f9cc",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Dropdown(description='Device:', options=('CPU', 'AUTO'), value='CPU')"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "embedding_device = device_widget()\n",
    "\n",
    "embedding_device"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "abd61822",
   "metadata": {},
   "source": [
    "A Hugging Face embedding model can be supported by OpenVINO through [`OpenVINOEmbeddings`](https://docs.llamaindex.ai/en/stable/examples/embeddings/openvino/) class of LlamaIndex."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "d3448c9f",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Compiling the model to CPU ...\n"
     ]
    }
   ],
   "source": [
    "from llama_index.embeddings.huggingface_openvino import OpenVINOEmbedding\n",
    "\n",
    "embedding = OpenVINOEmbedding(model_id_or_path=embedding_model_path, device=embedding_device.value)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4f7d4971",
   "metadata": {},
   "source": [
    "## Create tools\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "In this examples, we will create 2 customized tools for `multiply` and `add`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "f594cf18-8100-4207-9ec0-7ded996e85e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.agent import ReActAgent\n",
    "from llama_index.core.tools import FunctionTool\n",
    "\n",
    "\n",
    "def multiply(a: float, b: float) -> float:\n",
    "    \"\"\"Multiply two numbers and returns the product\"\"\"\n",
    "    return a * b\n",
    "\n",
    "\n",
    "multiply_tool = FunctionTool.from_defaults(fn=multiply)\n",
    "\n",
    "\n",
    "def divide(a: float, b: float) -> float:\n",
    "    \"\"\"Add two numbers and returns the sum\"\"\"\n",
    "    return a / b\n",
    "\n",
    "\n",
    "divide_tool = FunctionTool.from_defaults(fn=divide)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3f05a4ab",
   "metadata": {},
   "source": [
    "To demonstrate using RAG engines as a tool in an agent, we're going to create a very simple RAG query engine as one of the tools. \n",
    "\n",
    ">**Note**: For a full RAG pipeline with OpenVINO, you can check the [RAG notebooks](../llm-rag-llamaindex)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "eea245b9-73c5-431e-af47-3e676888bd5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import SimpleDirectoryReader\n",
    "from llama_index.core import VectorStoreIndex, Settings\n",
    "\n",
    "Settings.embed_model = embedding\n",
    "Settings.llm = llm\n",
    "\n",
    "reader = SimpleDirectoryReader(input_files=[text_example_en_path])\n",
    "documents = reader.load_data()\n",
    "index = VectorStoreIndex.from_documents(\n",
    "    documents,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "0d807600",
   "metadata": {},
   "source": [
    "Now we turn our query engine into a tool by supplying the appropriate metadata (for the python functions, this was being automatically extracted so we didn't need to add it):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "9b8cd9c9-a595-4baf-9adc-77f740f19f1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import QueryEngineTool, ToolMetadata\n",
    "\n",
    "vector_tool = QueryEngineTool(\n",
    "    index.as_query_engine(streaming=True),\n",
    "    metadata=ToolMetadata(\n",
    "        name=\"vector_search\",\n",
    "        description=\"Useful for searching for basic facts about 'Intel Xeon 6 processors'\",\n",
    "    ),\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "40cdfcbc",
   "metadata": {},
   "source": [
    "## Run Agentic RAG\n",
    "\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "We modify our agent by adding this engine to our array of tools (we also remove the llm parameter, since it's now provided by settings):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "c8aefd1d-be3c-46f9-bd67-5c8557f9b385",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent = ReActAgent.from_tools([multiply_tool, divide_tool, vector_tool], llm=llm, verbose=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "ed0033ed",
   "metadata": {},
   "source": [
    "Ask a question using multiple tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "cbf386c9-f74e-4948-9ea0-94b69b7b2e29",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "> Running step ee829c21-5642-423d-afcf-27e894aede35. Step input: What's the maximum number of cores of 8 sockets of 'Intel Xeon 6 processors' ? Go step by step, using a tool to do any math.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;38;5;200mThought: The current language of the user is English. I need to use a tool to help me answer the question.\n",
      "Action: vector_search\n",
      "Action Input: {'input': 'Intel Xeon 6 processors'}\n",
      "\u001b[0m"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;34mObservation: According to the provided text, Intel Xeon 6 processors with Efficient-cores are described as having the following features and benefits:\n",
      "\n",
      "* Up to 144 cores per socket in 1- or 2-socket configurations, boosting processing capacity, accelerating service mesh performance, and decreasing transaction latency.\n",
      "* Improved power efficiency and lower idle power ISO configurations, contributing to enhanced sustainability with a TDP range of 205W-330W.\n",
      "* Intel QuickAssist Technology (Intel QAT) drives fast encryption/key protection, while Intel Software Guard Extensions (Intel SGX) and Intel Trust Domain Extensions (Intel TDX) enable confidential computing for regulated workloads.\n",
      "* Intel Xeon 6 processor-based platforms with Intel Ethernet 800 Series Network Adapters set the bar for maximum 5G core workload performance and lower operating costs.\n",
      "\n",
      "These processors are suitable for various industries, including:\n",
      "\n",
      "* Telecommunications: 5G core networks, control plane (CP), and user plane functions (UPF)\n",
      "* Enterprise: Network security appliances, secure access service edge (SASE), next-gen firewall (NGFW), real-time deep packet inspection, antivirus, intrusion prevention and detection, and SSL/TLS inspection\n",
      "* Media and Entertainment: Content delivery networks, media processing, video on demand (VOD)\n",
      "* Industrial/Energy: Digitalization of automation, protection, and control\n",
      "\n",
      "The processors are also mentioned to be suitable for various use cases, including:\n",
      "\n",
      "* 5G core networks\n",
      "* Network security appliances\n",
      "* Content delivery networks\n",
      "* Media processing\n",
      "* Video on demand (VOD)\n",
      "* Digitalization of automation, protection, and control in industrial and energy sectors\n",
      "\u001b[0m> Running step c8d3f8b5-0a3e-4254-87a8-c13cd4f992ad. Step input: None\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;38;5;200mThought: The current language of the user is English. I need to use a tool to help me answer the question.\n",
      "Action: multiply\n",
      "Action Input: {'a': 8, 'b': 144}\n",
      "\u001b[0m\u001b[1;3;34mObservation: 1152\n",
      "\u001b[0m> Running step 437a7fcf-7f53-4d7c-b3d4-06b2714a1b9d. Step input: None\n",
      "\u001b[1;3;38;5;200mThought: The current language of the user is English. I can answer without using any more tools. I'll use the user's language to answer.\n",
      "Answer: The maximum number of cores of 8 sockets of 'Intel Xeon 6 processors' is 1152.\n",
      "\u001b[0m"
     ]
    }
   ],
   "source": [
    "response = agent.chat(\"What's the maximum number of cores of 8 sockets of 'Intel Xeon 6 processors' ? Go step by step, using a tool to do any math.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "fd3211e2-4465-41e1-bbd6-b493a53ffccc",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.reset()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  },
  "openvino_notebooks": {
   "imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/assets/91237924/871cb90d-27fd-4a87-aa3c-f4cdb199a148",
   "tags": {
    "categories": [
     "Model Demos",
     "AI Trends"
    ],
    "libraries": [],
    "other": [
     "LLM"
    ],
    "tasks": [
     "Text Generation"
    ]
   }
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
